Macho-Nice Guy Dichotomy

Author

Gilad Sarusi

1 HistWords repo

For our purposes, I use the same approach chosen by Tessa E. S. Charlesworth and colleagues—namely, the HistWords repository, which is available on GitHub.

1.0.1 What’s going on?

HistWords is an open-source dataset and toolkit that provides word embeddings for different historical periods, trained on Google Books Ngram (and other corpora). Each decade (1800–1990) has its own 300-dimensional vector space representing the meaning of words in that era.

Think of it as a longitudinal semantic dataset—rather than tracking people over time, we track words across decades. Each vector encodes a word’s position in the shared “meaning space” of its time.

1.0.1.1 What can we do with it?

  • Inspect meaning: find the nearest neighbors of a word in any decade.

  • Compare meanings over time: compute cosine similarity for the same word across decades to track semantic drift.

  • Study cultural change: analyze how a word’s associations (e.g., macho, nice) evolve by decade.

  • Compose phrases: average vectors (e.g., macho + guy) to represent multi-word concepts.

  • Align embeddings: apply sequential Procrustes alignment to make spaces from different decades comparable.
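As a minimal sketch of the first two operations (assuming a decade's embeddings are loaded as a numeric matrix `E` with words as rownames; `E1890` and `E1990` are hypothetical decade matrices):

```r
# Cosine similarity between two vectors
cos_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Nearest neighbors of a word within one decade's embedding space
nearest <- function(E, word, n = 10) {
  sims <- apply(E, 1, cos_sim, b = E[word, ])
  sort(sims[names(sims) != word], decreasing = TRUE)[1:n]
}

# Semantic drift: inspect the same word's neighborhood in two decades
# nearest(E1890, "nice"); nearest(E1990, "nice")
```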

1.0.2 What are we comparing?

In this project, I adopt an approach similar to that of Charlesworth et al., designed to avoid the alignment problem inherent in diachronic word embeddings—which makes direct comparison of raw cosine similarities between two embedding spaces (e.g., the 1910s and the 1920s) invalid.


Instead of directly comparing raw cosine similarities of words across decades, I examine within-decade associations—for example, comparing how strongly macho and nice are each associated with the words in a given decade.

The difference between these associations is then converted into a standardized effect size, which serves as the unit of comparison across time. This method lets me trace semantic or attitudinal change without requiring the embedding spaces to be aligned.
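The within-decade logic can be sketched in a few lines (a toy illustration, not the code used later; `macho_vec` and `nice_vec` stand for one decade's construct vectors, and `E` for that decade's embedding matrix with words as rownames):

```r
cos_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Within a single decade: difference in association with each construct,
# standardized as Cohen's d_z = mean(delta) / sd(delta)
within_decade_d <- function(E, macho_vec, nice_vec) {
  delta <- apply(E, 1, function(v) cos_sim(v, macho_vec) - cos_sim(v, nice_vec))
  mean(delta) / sd(delta)
}
# Because d_z is computed inside each decade, no cross-decade alignment is needed.
```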

  • Across time (overall)—I identify the top 10 and top 50 word and trait associates that are most strongly (and relatively) associated with one type of man (and least with the comparison group) by:

    1. computing cosine similarity of all words and traits to each type in each decade (e.g., the cosine similarities of all traits to macho man in 1800). After inspecting distances, I test whether there is a significant difference between the two types—i.e., whether the descriptors are semantically distinct from those of nice guy—as a sanity check; and

    2. calculating the difference in MAC scores; Macho (MAC) − Nice (MAC) identifies words/traits most associated with Macho, and reversing the subtraction captures the opposite pattern. In short, across time, which words and traits are most associated with Macho/Nice Guy?

  • Across time (by decade)—I examine the top trait and word associates for each decade:

    1. compute decade-specific cosine-similarity difference scores and rank words and traits within each decade; and

    2. compute the average, decade-wise valence of the top associates by averaging ratings of those words’ positivity/negativity in each of the 20 decades.
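Steps 1–2 for a single decade can be sketched as follows (a toy example; assumes a data frame `df` with columns `word`, `delta` (the macho-minus-nice cosine difference), and `valence`):

```r
library(dplyr)

# Rank words by the decade-specific difference score and keep the top n
top_associates <- function(df, n = 10) {
  df |>
    arrange(desc(delta)) |>
    slice_head(n = n)
}

# Average valence of the top associates within that decade
top_valence <- function(df, n = 10) mean(top_associates(df, n)$valence, na.rm = TRUE)
```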

Nevertheless, following Charlesworth et al.'s guidance, applying orthogonal Procrustes rotation can still be useful as a robustness check or for future extensions involving cross-decade word-trajectory analyses.
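For reference, orthogonal Procrustes rotation has a closed-form SVD solution; a sketch (assuming `A` and `B` are two decade matrices restricted to a shared vocabulary in the same row order):

```r
# Find the orthogonal R minimizing ||A %*% R - B||_F and rotate A onto B
procrustes_align <- function(A, B) {
  s <- svd(t(A) %*% B)
  R <- s$u %*% t(s$v)   # optimal orthogonal rotation
  A %*% R               # A expressed in B's coordinate system
}
```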

The following code is written using the ds4psych: Data Science for Psychology: Natural Language workflow (https://ds4psych.com). Therefore, the toolset used here differs from that of Charlesworth et al.

Notably, in the 2000s there was a change in N-gram sampling. Therefore, embeddings using the corpus from the 2000s onward are excluded.

## Words norm - for valence computation: 


warriner_path <- "wordstim/allwrdnorms_warriner.csv"          # columns: word, valence, arousal, dominance. Valence is the average rating on a 9-point scale from 1 (very negative) to 9 (very positive), obtained from Warriner and colleagues (2013)

As an alternative, we might consider NRC VAD (S. M. Mohammad, 2018), which contains 20,007 words with ratings between 0 and 1 for valence, arousal, and dominance.
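If NRC VAD were used instead, its 0–1 ratings would need rescaling to match Warriner's 9-point scale; a hypothetical one-liner:

```r
# Linearly map a 0-1 VAD rating onto a 1-9 scale
vad_to_9pt <- function(x) 1 + 8 * x
```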

## List of traits:

traits_path <- "wordstim/traitlist.txt"

# We will later create a file storing these words so the code is more elegant:

nice_guy_words <- c(
  "kind", "gentle", "caring", "sensitive", "considerate", "respectful",
  "supportive", "understanding", "emotional", "warm", "affectionate",
  "polite", "reliable", "faithful", "honest", "loyal", "thoughtful",
  "attentive", "humble", "nice", "whipped", "romantic"
)
macho_guy_words <- c(
  "dominant", "assertive", "confident", "tough", "strong", "decisive",
  "bold", "powerful", "competitive", "aggressive", "independent", "unemotional",
  "fearless", "alpha", "leader", "charismatic", "commanding", "cocky", "ambitious", "macho"
)

1.0.3 Pay attention!

To ensure that words are embedded in the context of masculine traits, I examine two approaches. The first is to inject male anchors. This should help position each word group within a masculine semantic field rather than in a generic trait context (traits that could appear in feminine contexts as well). This is implemented in the Concepts chunk.

  • Cons / watch-outs:
    Anchor dominance risk—male anchors can outweigh trait terms if the lists are small. Moreover, shared anchors in both groups reduce the distance between group centroids (smaller margins).
    Monitor: anchor share in each centroid (≤30–40%), cosine margin between group means, and Top-N stability across decades (overlap %).

The second approach, implemented in the next Helpers chunk, is a masculinity projection nudge. I construct a male–female axis and then “masculinize” any term by nudging it toward the male direction before scoring. This gently steers polysemous words toward the male sense without hard-wiring “man” as a single axis for all items.

  • Cons / watch-outs:
    Axis quality—if the gender sets are unbalanced or drift strongly by decade, the direction can be noisy.
    Leakage—over-weighting (large w) can wash out original semantics.
    Monitor: axis stability across decades (cosine between decade-specific dmale), use small w (e.g., 0.1–0.3) and run sensitivity analyses; compare results with and without the tweak.

A comparison of both techniques is needed to validate results and serve as a robustness check.
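The diagnostics listed above (anchor share, cosine margin) could be monitored with small helpers like these (a sketch, under the assumption that word vectors are rows of `E` with words as rownames):

```r
cos_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Anchor share: fraction of a word bank made up of male anchors
anchor_share <- function(word_list, anchors) mean(word_list %in% anchors)

# Cosine margin between the two group centroids (larger = better separated)
cosine_margin <- function(E, words_a, words_b) {
  mu_a <- colMeans(E[intersect(words_a, rownames(E)), , drop = FALSE])
  mu_b <- colMeans(E[intersect(words_b, rownames(E)), , drop = FALSE])
  1 - cos_sim(mu_a, mu_b)
}
```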

# --- 1) Small helpers ---------------------------------------

# Read a single decade embedding file into a matrix with rownames = words

read_embeddings <- function(path) {
  # HistWords format: optional first line with counts; thereafter: word + 300 dims
  
  raw <- vroom(path, col_names = FALSE, progress = FALSE, delim = " ")
  # Some files include the first line as counts; detect and drop it if so
  # (suppressWarnings avoids the coercion warning when X1 is a word)
  if (nchar(raw$X1[1]) > 0 && !is.na(suppressWarnings(as.numeric(raw$X1[1])))) {
    # first line looks numeric -> drop it
    raw <- raw[-1, ]
  }
  words <- raw$X1
  M <- as.matrix(raw[, -1])
  rownames(M) <- words
  storage.mode(M) <- "double"
  M
}

1.0.4 Masculinity projection nudge

As noted earlier, to interpret traits within a masculine frame, we apply a Masculinity Nudge: we construct a female–male axis in the embedding space and add a small component of this axis to each construct vector, gently shifting it toward the male semantic field so that subsequent comparisons reflect the words’ masculine sense.

# E: numeric matrix (rows = tokens, cols = dims), 
#   rownames(E) are tokens - we should make sure our embedding model's architecture is built in this manner.
# male_tokens / female_tokens: anchors for the axis (can leave defaults)

 male_anchors <-  c("man","men","male","guy")  #Question: should we add 'boy'? 
 female_anchors <-  c("woman", "women", "female","girl")

 
 build_male_female_axis <- function(E, male_tokens, female_tokens) {
  avail <- rownames(E)
  m <- intersect(tolower(male_tokens),   avail)
  f <- intersect(tolower(female_tokens), avail)
  if (length(m) == 0 || length(f) == 0)
    stop("Insufficient anchors to build gender axis.")
  
  male_mu   <- colMeans(E[m, , drop = FALSE])
  female_mu <- colMeans(E[f, , drop = FALSE])
  axis <- male_mu - female_mu
  axis / sqrt(sum(axis * axis)) # normalize to unit length: the gender axis should represent direction only, not magnitude
 }

# DDR-based masculinity function:

# w = nudge strength (0 = no nudge, >0 = push toward masculine direction)
 
# default is set to 1 so results are easy to interpret and to make sure that the word we are assessing is treated as word_GUY rather than that word in general
 
 
masculinity_nudge <- function(ddr_vec, # Or any dim1:dim300 embedding object
                              E, # The embedding space 
                              male_tokens = male_anchors,
                              female_tokens = female_anchors,
                              w = 1) {
  
  # ensure DDR is a numeric vector
  if (is.matrix(ddr_vec)) ddr_vec <- as.numeric(ddr_vec)
  
  axis <- build_male_female_axis(E, male_tokens, female_tokens)
  ddr_vec <- ddr_vec / sqrt(sum(ddr_vec * ddr_vec))
  
  # apply nudge: shift toward masculine axis 
  ddr_vec_nudged <- ddr_vec + w * axis
  
  # re-normalize to unit length - consult Almog if necessary 
  ddr_vec_nudged <- ddr_vec_nudged / sqrt(sum(ddr_vec_nudged * ddr_vec_nudged))
  
  ddr_vec_nudged
}


# From df to embedding Object function- to fit 'embedplyr' workflow:

to_embeddings <- function(df){
  # keep available rows, ensure rownames are tokens
  stopifnot(is.data.frame(df), !is.null(rownames(df)))
  
  # Keep only available words - 
  keep <- df$V1 != 0
  
  M <- as.matrix(df[keep, , drop = FALSE])
  # give it the right class so embedplyr methods (predict/emb/find_nearest) work
  class(M) <- c("embeddings", class(M))
  attr(M, "normalized") <- FALSE  # optional meta; not required
  
  # crucial: token_index environment (token -> row index) - since the words are stored as rownames
  idx_env <- new.env(parent = emptyenv())
  toks <- rownames(M)
  for (j in seq_along(toks)) assign(toks[j], j, envir = idx_env)
  attr(M, "token_index") <- idx_env

  M
}

The second approach—the Inject approach—as noted above, adds male anchor words to the word banks so that the average embedding of each list is drawn toward a masculine context.
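The injected word banks themselves are never defined in this document; given the lengths reported later (25 = 20 macho trait words + the 5 anchors reported by length(male_anchors)), they are presumably built by concatenation, e.g.:

```r
# Presumed construction of the injected word banks (hypothetical, but
# consistent with the lengths reported later in the sanity checks)
macho_guy_words_injected <- c(macho_guy_words, male_anchors)
nice_guy_words_injected  <- c(nice_guy_words,  male_anchors)
```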

1.0.5 ENGall Model

Properties of ENGall_model:

  • n of words: 100,000 (the same word list across all decades); the authors computed embeddings only for words that were among the top 100,000 most frequent words over all time (for EngAll).
  • n of dimensions: 300.
  • Type of embedding: SGNS.
  • Context window: symmetric window of 4 words.
## ----------- ENGall model -----------

 load(ENGall_dir)
# load(ENGall_dir) creates a 'wordvecs.dat' object, which I store under a more intuitive name
ENGall_model <- wordvecs.dat
length(ENGall_model)
[1] 20

The ENGall model provides embeddings for the 100,000 most frequent words across all time. This means words coined later in history will be missing in earlier slices, and some words will be absent in particular decade models. The model is stored as a list of 20 elements, one per decade. In some decade data frames, certain words have no embedding.

Let’s map those words.

#### ---- Check unavailable words by decade ----
n_avwords <- vector()
n_unav<- vector()
prop_avail <- vector()
n_total <- vector()

# Counting available and unavailable words per decade: 

for (i in seq_along(ENGall_model)) {
  # Count available words as those with V1 != 0
  n_avwords[i] <- sum(ENGall_model[[i]]$V1 != 0)
  n_total[i] <- ENGall_model[[i]]|>
      nrow()  
  n_unav[i]  <- n_total[i] - n_avwords[i]
  prop_avail[i] <- n_avwords[i] / n_total[i]
}


avwords_n_decade_df <- data.frame(
  decade      = decades,
  n_avwords   = n_avwords,
  n_unavwords = n_unav,
  prop_avail  = prop_avail
)
kable(avwords_n_decade_df |>
        #adding comma
        mutate(across(where(is.numeric),scales::comma)),
      digits = 3,
      caption = "Proportion of Words with Embedding")
Proportion of Words with Embedding
decade n_avwords n_unavwords prop_avail
1,800 13,045 86,955 0.1304
1,810 15,771 84,229 0.1577
1,820 20,312 79,688 0.2031
1,830 21,691 78,309 0.2169
1,840 23,818 76,182 0.2382
1,850 29,035 70,965 0.2904
1,860 27,191 72,809 0.2719
1,870 29,320 70,680 0.2932
1,880 34,081 65,919 0.3408
1,890 37,729 62,271 0.3773
1,900 41,551 58,449 0.4155
1,910 36,553 63,447 0.3655
1,920 35,643 64,357 0.3564
1,930 34,477 65,523 0.3448
1,940 34,226 65,774 0.3423
1,950 41,807 58,193 0.4181
1,960 54,332 45,668 0.5433
1,970 60,344 39,656 0.6034
1,980 64,934 35,066 0.6493
1,990 71,097 28,903 0.7110

Out of 100,000 words, each decade has on average 36,347 available words.

Notably, the 1990s slice has ~71% of words available. This does not mean “29% are post-1990 words.” Rather, 29% of the top-100k (over 1800–1999) simply lack enough occurrences in the 1990s to train vectors—often because they are earlier-era words that faded out, or because they are rare or orthographically different in that decade.

Later, we will pull valence from Warriner’s list (~14,000 words), which bounds our effective vocabulary to those items. Because not all Warriner words appear in ENGall, the intersection may be smaller.

Another concern is whether some of the words used to define our constructs are missing from the model.

Let’s check!

### ---- Check if macho/nice words by decade ------- 
# 1) Availability (token x decade)

avail <- map2_df(ENGall_model, decades, ~
  tibble(
    decade    = .y,
    token     = tolower(rownames(.x)),
    available = .x$V1 != 0
  )
)

#  Define the sets you care about
sets <- list(
  macho = macho_guy_words,
  nice  = nice_guy_words,
  male = male_anchors
)

wanted <- enframe(sets, name = "set", value = "token") |>
  unnest(token) |>
  mutate(token = tolower(token))

# proportion per decade + list of missing words 
macho_nice_words_by_decade_df <- wanted |>
  left_join(avail, by = "token") |>
  mutate(available = tidyr::replace_na(available, FALSE)) |>
  group_by(set, decade) |>
  summarise(
    n_of_words_examined = n(),
    n_available = sum(available),
    prop_avail    = n_available / n_of_words_examined,
    missing     = list(token[!available]),
    .groups = "drop"
  ) |>
  arrange(set, decade)
kable(macho_nice_words_by_decade_df,
      caption= "Construct's word presence across time")
Construct’s word presence across time
set decade n_of_words_examined n_available prop_avail missing
macho 1800 20 10 0.5000000 dominant , assertive , competitive, aggressive , unemotional, fearless , alpha , charismatic, cocky , macho
macho 1810 20 12 0.6000000 dominant , assertive , competitive, aggressive , unemotional, charismatic, cocky , macho
macho 1820 20 12 0.6000000 dominant , assertive , competitive, aggressive , unemotional, charismatic, cocky , macho
macho 1830 20 13 0.6500000 assertive , competitive, aggressive , unemotional, charismatic, cocky , macho
macho 1840 20 13 0.6500000 assertive , competitive, aggressive , unemotional, charismatic, cocky , macho
macho 1850 20 14 0.7000000 assertive , competitive, unemotional, charismatic, cocky , macho
macho 1860 20 14 0.7000000 assertive , competitive, unemotional, charismatic, cocky , macho
macho 1870 20 15 0.7500000 assertive , unemotional, charismatic, cocky , macho
macho 1880 20 15 0.7500000 assertive , unemotional, charismatic, cocky , macho
macho 1890 20 15 0.7500000 assertive , unemotional, charismatic, cocky , macho
macho 1900 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1910 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1920 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1930 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1940 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1950 20 16 0.8000000 unemotional, charismatic, cocky , macho
macho 1960 20 17 0.8500000 unemotional, cocky , macho
macho 1970 20 17 0.8500000 unemotional, cocky , macho
macho 1980 20 20 1.0000000
macho 1990 20 20 1.0000000
male 1800 5 5 1.0000000
male 1810 5 5 1.0000000
male 1820 5 5 1.0000000
male 1830 5 5 1.0000000
male 1840 5 5 1.0000000
male 1850 5 5 1.0000000
male 1860 5 5 1.0000000
male 1870 5 5 1.0000000
male 1880 5 5 1.0000000
male 1890 5 5 1.0000000
male 1900 5 5 1.0000000
male 1910 5 5 1.0000000
male 1920 5 5 1.0000000
male 1930 5 5 1.0000000
male 1940 5 5 1.0000000
male 1950 5 5 1.0000000
male 1960 5 5 1.0000000
male 1970 5 5 1.0000000
male 1980 5 5 1.0000000
male 1990 5 5 1.0000000
nice 1800 22 15 0.6818182 caring , sensitive , considerate, supportive , emotional , reliable , thoughtful
nice 1810 22 18 0.8181818 caring , supportive, emotional , reliable
nice 1820 22 19 0.8636364 supportive, emotional , reliable
nice 1830 22 19 0.8636364 supportive, emotional , reliable
nice 1840 22 20 0.9090909 supportive, emotional
nice 1850 22 21 0.9545455 supportive
nice 1860 22 21 0.9545455 supportive
nice 1870 22 21 0.9545455 supportive
nice 1880 22 21 0.9545455 supportive
nice 1890 22 21 0.9545455 supportive
nice 1900 22 21 0.9545455 supportive
nice 1910 22 21 0.9545455 supportive
nice 1920 22 21 0.9545455 supportive
nice 1930 22 21 0.9545455 supportive
nice 1940 22 21 0.9545455 supportive
nice 1950 22 22 1.0000000
nice 1960 22 22 1.0000000
nice 1970 22 22 1.0000000
nice 1980 22 22 1.0000000
nice 1990 22 22 1.0000000

It seems the words cocky and macho do not appear in ENGall at all. I therefore omit them to avoid adding noise.

(Consult with Almog about the unequal lengths of the nice vs. macho lists after omission: length(nice) = 22; length(macho) = 18.)

length(macho_guy_words)
[1] 20
#[1] 20

length(macho_guy_words_injected)
[1] 25
#[1] 25

length(male_anchors)
[1] 5
# [1] 5

macho_guy_words <- setdiff(macho_guy_words,c("cocky", "macho"))
macho_guy_words_injected <- setdiff(macho_guy_words_injected,c("cocky", "macho"))

# Sanity check
length(macho_guy_words)
[1] 18
# [1] 18

length(macho_guy_words_injected)
[1] 23
# [1] 23

# OK I'm sane

2 DDR

Now I will create the Macho and Nice Guy constructs (DDRs). Currently ENGall_model[[i]] (the model of a specific decade) is a data frame, while the predict() function expects an embeddings object, which is why we use the to_embeddings() helper:

# ---- setting the ENGall model to fit 'embedplyr' workflow





## ----------- Macho DDR (with injection) ---------- 
Macho_DDR_injected_by_decade <- list()

for (i in seq_along(ENGall_model)) {
  
  # Getting the DDR embedding of the macho_guy words (with injection), converting the ENGall model to an embeddings object:
  
  Macho_DDR_injected_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), macho_guy_words_injected)|> 
    
    # I average the embeddings inside the loop since each DDR is computed on its own corpus/decade. Note that the averaging is frequency-weighted, with weights from the Google Trillion Word corpus.
    
    
    # Consult with Almog - should the weighting be done by our model of choice (ENGall), setting anchor = "good"?
  average_embedding()
}

 
# Make it more readable: 

names(Macho_DDR_injected_by_decade) <- decades

## ----------- Nice DDR (with injection) ---------- 

# Follow the same process as before
Nice_DDR_injected_by_decade <- list()

for (i in seq_along(ENGall_model)) {
  
  Nice_DDR_injected_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), nice_guy_words_injected)|> 
  average_embedding()
}

 
# Make it more readable: 

names(Nice_DDR_injected_by_decade) <- decades


#----------- The 'Nudge Approach' -------------


## ----------- Macho DDR Nudge  ---------- 


# Follow the same process as before
Macho_DDR_nudge_by_decade <- list()

for (i in seq_along(ENGall_model)) {

  Macho_DDR_nudge_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), macho_guy_words)|>
  average_embedding() |>
    
    # Same as before but now I'll apply the nudge toward the masculine direction
    
    masculinity_nudge(E = ENGall_model[[i]])
}

names(Macho_DDR_nudge_by_decade) <- decades


## ----------- Nice DDR Nudge  ---------- 

Nice_DDR_nudge_by_decade <- list()

for (i in seq_along(ENGall_model)) {

  Nice_DDR_nudge_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), nice_guy_words)|>
  average_embedding() |>
    
    # Same as before but now I'll apply the nudge toward the masculine direction
    
    masculinity_nudge(E = ENGall_model[[i]])
}

names(Nice_DDR_nudge_by_decade) <- decades

2.1 Cos Sims

DDRs are just points in semantic space. Next, I compute the cosine similarity of each model word to these DDRs.

Specifically, I compute cosine similarity between the Macho and Nice DDRs and only the words in the model that have valence scores—i.e., those appearing in Warriner.

(Warriner contains ~14,000 words with valence. After this examination, I repeat the analysis with the trait list—i.e., restricting to words that appear in both Warriner and the trait list.)

warriner_with_COSINE_from_DDR <- list() 

# Switching to a dfm to fit the ds4psych workflow, making each word a feature

# To do so, first convert it to a corpus: 

warriner_corpus <- warriner |>
  corpus(
    text_field = "word",
     docid_field ="id")


# REMEMBER: this corpus carries the valence ratings in its docvars! 
warriner_dfm <- warriner_corpus |>
  tokens() |>
  dfm()

# now that we have dfm we can use get_sims()



for (i in seq_along(ENGall_model)) {

  
  warriner_with_COSINE_from_DDR[[i]] <- warriner_dfm|>
    
    # Getting the embedding of the Warriner words by ENGallL:
    
    textstat_embedding(to_embeddings(ENGall_model[[i]]))  |> 
    bind_cols(docvars(warriner_corpus)) |>
    
    # Getting the cosine_squished similarity to the Macho DDR with nudge (remember it's a list by decade, so index with [[i]])
    
    get_sims(
      V1:V300, 
      list(
        macho_nudge     = Macho_DDR_nudge_by_decade[[i]],
        macho_inject       = Macho_DDR_injected_by_decade[[i]],
        nice_nudge      = Nice_DDR_nudge_by_decade[[i]],
        nice_inject       = Nice_DDR_injected_by_decade[[i]]
      ), 
      method = "cosine_squished"
    ) |> 
    
    # adding the word so the results are readable 
    left_join(warriner |> select(word,id),
              by= c("doc_id" ="id"))
}
names(warriner_with_COSINE_from_DDR) <- decades



#-------- NOTICE! ------

# Before the comparisons I'll omit the words that were used to create the DDRs, namely macho_guy_words and nice_guy_words, to reduce noise.

# A sanity check revealed that those words received higher cosine similarities (DUH!)
warriner_COSINE_DDR_omit<- list()
for (i in seq_along(warriner_with_COSINE_from_DDR)) {
  warriner_COSINE_DDR_omit[[i]]<- warriner_with_COSINE_from_DDR[[i]] |> 
    filter(!word %in% c(macho_guy_words, nice_guy_words))
}
names(warriner_COSINE_DDR_omit) <- decades

#Exporting  files for inspection 
# for (i in seq_along(warriner_with_COSINE_from_DDR)) {
# 
#   write_csv(
#     warriner_COSINE_DDR_omit[[i]],
#     file = paste("csv/",paste(decades[i],"decade.csv"))
#   )
# }

2.2 Deltas

#Helper - Comparison:
compare_DDRs<- function(df) {
  
  df |> 
    mutate(
      # Nudge
      delta_nudge = macho_nudge - nice_nudge,
      # Inject
      delta_inject = macho_inject - nice_inject,
    )
}


# Adding the variables to the list 
warriner_COSINE_DDR_Diff <- list() 
for (i in seq_along(warriner_COSINE_DDR_omit)) {
  warriner_COSINE_DDR_Diff[[i]]<- compare_DDRs(warriner_COSINE_DDR_omit[[i]])
}
names(warriner_COSINE_DDR_Diff) <- decades 

# Convert to long format so it is easier to work with:


bind_with_decade <- function(dflist) {
  
  tibble(
    decade = names(dflist),
    df     = dflist
  ) |>
    mutate(df = map2(df, decade, ~ .x |>
                       mutate(decade = .y))) |>
    pull(df) |>
    bind_rows()
}

raw_long <- bind_with_decade(warriner_COSINE_DDR_Diff) |>
  mutate(
    
    # Make `decade` an ordered factor (sorted numerically)
    
    decade     = factor(decade, levels = sort(unique(as.integer(decade))) |> as.character())
  ) 

2.2.1 Q1

Is there, overall, a difference in association across 200 years between Macho and Nice Guy for these word lists?

# (a paired t-test could also be justified here, since the same words contribute to both columns)
tt <- t.test(raw_long$macho_nudge,raw_long$nice_nudge,alternative = "two.sided") 
if(tt$p.value < 0.05) {
  print("Q1 - supports H1")
} else {
  print("Q1 - supports H0")
}
[1] "Q1 - supports H1"
compare_nice_macho_words <- function (data, only_traits =F){
  
  # removing NAs by delta_nudge 
  data <- data |> 
    filter(!is.na(delta_nudge)) |>
    
    mutate(doc_id = factor(doc_id))
  # Checking if the comparison is restricted to traits only or uses all the words
  if(only_traits){
    data <-  data |> filter(doc_id %in% (traits_warriner |>
                                    select(id) |>
                                    pull()))
  }
  

  data  |>
     # THIS FUNCTION IS AT WORD LEVEL NOT DECADE! 

    group_by(doc_id) |>
    
    summarise(
      n = n(),
      # Mean
      mean_delta_nudge = mean(delta_nudge, na.rm =T ),
      mean_delta_inject = mean(delta_inject, na.rm =T ),
      
      # SD
      sd_nudge   = sd(delta_nudge, na.rm =T),
      sd_inject   = sd(delta_inject, na.rm =T),
      
      # SE
      se_nudge         = sd_nudge / sqrt(n),
      se_inject         = sd_inject  / sqrt(n),
      
      # T 
      t_nudge          = ifelse(n > 1,
                                mean_delta_nudge / se_nudge,
                                NA_real_),
      t_inject      = ifelse(n > 1,
                             mean_delta_inject / se_inject,
                             NA_real_),
      
      df         = n - 1L,
      
      # P values
      p_t_nudge        = ifelse(n > 1,
                                2 * pt(-abs(t_nudge), df),
                                NA_real_),
      p_t_inject        = ifelse(n > 1,
                                 2 * pt(-abs(t_inject), df),
                                 NA_real_),
      
      
      #Cohen's D:  mean of Δ divided by SD of Δ
      d_z_nudge        = mean_delta_nudge / sd_nudge,
      d_z_inject       = mean_delta_inject / sd_inject,
      
      # Confidence intervals
      ci_low_nudge      = ifelse(n > 1,
                                 mean_delta_nudge + qt(0.025, df) * se_nudge,
                                 NA_real_),
      ci_high_nudge    = ifelse(n > 1,
                                mean_delta_nudge + qt(0.975, df) * se_nudge,
                                NA_real_),
      ci_low_inject      = ifelse(n > 1,
                                  mean_delta_inject + qt(0.025, df) * se_inject,
                                  NA_real_),
      ci_high_inject     = ifelse(n > 1,
                                  mean_delta_inject  + qt(0.975, df) * se_inject,
                                  NA_real_),
      
      # Significance
      
      sign_nudge = case_when(
        p_t_nudge < 0.001 ~ "***",
        p_t_nudge < 0.01  ~ "**",
        p_t_nudge < 0.05  ~ "*",
        TRUE        ~ "No"
      ),
      sign_inject  = case_when(
        p_t_inject < 0.001 ~ "***",
        p_t_inject < 0.01  ~ "**",
        p_t_inject < 0.05  ~ "*",
        TRUE        ~ "No"
      ),
      .groups = "drop"
      ) |>
        left_join(raw_long |> select(doc_id,
                                     word,
                                     valence),
                  by= "doc_id")|>
        # Staying on word level - row = words
        distinct()
    
}






####----- GENERAL COMPARISON -----


# NOTE: We don't use the helper function here because the grouping is done by decade, not by word:

summ_t_wtnin_dcds_overall <- raw_long|>
  #removing na  by delta_nudge 
  
  filter(!is.na(delta_nudge)) |>
  
  group_by(decade)|>
  summarise(
    n = n(),
    # Mean
    mean_delta_nudge = mean(delta_nudge, na.rm =T ),
    mean_delta_inject = mean(delta_inject, na.rm =T ),
    
    # SD
    sd_nudge   = sd(delta_nudge, na.rm =T),
    sd_inject   = sd(delta_inject, na.rm =T),
    
    # SE
    se_nudge         = sd_nudge / sqrt(n),
    se_inject         = sd_inject  / sqrt(n),
  
    # T 
    t_nudge          = ifelse(n > 1,
                              mean_delta_nudge / se_nudge,
                              NA_real_),
    t_inject      = ifelse(n > 1,
                           mean_delta_inject / se_inject,
                           NA_real_),

    df         = n - 1L,
    
    # P values
    p_t_nudge        = ifelse(n > 1,
                              2 * pt(-abs(t_nudge), df),
                              NA_real_),
    p_t_inject        = ifelse(n > 1,
                              2 * pt(-abs(t_inject), df),
                              NA_real_),
    
    
    #Cohen's D:  mean of Δ divided by SD of Δ
    d_z_nudge        = mean_delta_nudge / sd_nudge,
    d_z_inject       = mean_delta_inject / sd_inject,
    
    # Confidence intervals
    ci_low_nudge      = ifelse(n > 1,
                               mean_delta_nudge + qt(0.025, df) * se_nudge,
                               NA_real_),
    ci_high_nudge    = ifelse(n > 1,
                              mean_delta_nudge + qt(0.975, df) * se_nudge,
                              NA_real_),
    ci_low_inject      = ifelse(n > 1,
                               mean_delta_inject + qt(0.025, df) * se_inject,
                               NA_real_),
    ci_high_inject     = ifelse(n > 1,
                              mean_delta_inject  + qt(0.975, df) * se_inject,
                              NA_real_),
    
    # Significance
    
    sign_nudge = case_when(
      p_t_nudge < 0.001 ~ "***",
      p_t_nudge < 0.01  ~ "**",
      p_t_nudge < 0.05  ~ "*",
      TRUE        ~ "No"
    ),
    sign_inject  = case_when(
      p_t_inject < 0.001 ~ "***",
      p_t_inject < 0.01  ~ "**",
      p_t_inject < 0.05  ~ "*",
      TRUE        ~ "No"
    ),
    .groups = "drop"
  )

summ_t_inject <- summ_t_wtnin_dcds_overall |>
  select(-contains("_nudge"))
# NOT SIGNIFICANT - 1870

summ_t_nudge <- summ_t_wtnin_dcds_overall |>
  select(-contains("_inject"))

# NOT SIGNIFICANT - 1980
kable (summ_t_wtnin_dcds_overall |> select(decade,
                                        sign_nudge,
                                        sign_inject,
                                        p_t_nudge,
                                        p_t_inject),
       caption = "Significance test of the constructs throughout the decades")
Significance test of the constructs throughout the decades
decade sign_nudge sign_inject p_t_nudge p_t_inject
1800 *** ** 0.0000000 0.0017251
1810 *** ** 0.0000000 0.0043208
1820 *** *** 0.0000000 0.0000005
1830 *** *** 0.0000000 0.0000005
1840 *** *** 0.0000000 0.0000000
1850 *** *** 0.0000000 0.0000000
1860 *** *** 0.0000000 0.0000000
1870 ** No 0.0029673 0.6178121
1880 *** *** 0.0000000 0.0000181
1890 *** *** 0.0000000 0.0000000
1900 *** *** 0.0000000 0.0000128
1910 *** ** 0.0000000 0.0081183
1920 *** *** 0.0000000 0.0000000
1930 *** ** 0.0000000 0.0024509
1940 *** *** 0.0000000 0.0000000
1950 *** *** 0.0000000 0.0000000
1960 *** *** 0.0000000 0.0000000
1970 *** *** 0.0000000 0.0000000
1980 No * 0.0945198 0.0406543
1990 ** *** 0.0053125 0.0000000

Overall, in most decades and across both DDR techniques, there is a significant difference in the association of words with Macho versus Nice. This mainly serves as a validation that our two constructs capture distinct semantics.

A win is a Win!

2.2.2 Q2

Across all decades, which words or traits are most associated with each DDR (testing significance once overall)?

words_t_summ<- compare_nice_macho_words (raw_long)
  
## ------  Top 20 Words------
### ----- Macho --------------
top_20_words_macho_nudge_overall <- words_t_summ |> 
  # Sign only
  filter(!sign_nudge == "No") |> 
  # arranging to but the standardized delta across decades 
  arrange(
    desc(d_z_nudge)
  ) |>  
  
  # fetching the to 20
  slice_head(n = 20)     




top_20_words_macho_inject_overall <- words_t_summ |> 
  # Significant rows only
  filter(sign_inject != "No") |> 
  # arranging by the standardized delta across decades
  arrange(desc(d_z_inject)) |>  
  # fetching the top 20
  slice_head(n = 20)

### ----- Nice  --------------
top_20_words_nice_nudge_overall <- words_t_summ |> 
  # Significant rows only
  filter(sign_nudge != "No") |> 
  # Ascending this time: delta is macho minus nice, so the most negative
  # values indicate the strongest association with Nice
  arrange(d_z_nudge) |>  
  # fetching the top 20
  slice_head(n = 20)

top_20_words_nice_inject_overall <- words_t_summ |> 
  # Significant rows only
  filter(sign_inject != "No") |> 
  # Ascending this time: delta is macho minus nice, so the most negative
  # values indicate the strongest association with Nice
  arrange(d_z_inject) |>  
  # fetching the top 20
  slice_head(n = 20)
# A tibble: 20 × 23
   doc_id     n mean_delta_nudge mean_delta_inject  sd_nudge sd_inject  se_nudge
   <chr>  <int>            <dbl>             <dbl>     <dbl>     <dbl>     <dbl>
 1 13133      2         -0.0113           -0.0145  0.0000819  0.00128  0.0000579
 2 1545       2         -0.0333           -0.0415  0.000272   0.00205  0.000192 
 3 10675      2         -0.0223           -0.0282  0.000388   0.000498 0.000275 
 4 7969       2         -0.0260           -0.0339  0.000782   0.000947 0.000553 
 5 7228       2         -0.0235           -0.0252  0.00168    0.00381  0.00119  
 6 10874      2         -0.0101           -0.00562 0.000743   0.00497  0.000526 
 7 2879       2         -0.0278           -0.0364  0.00229    0.00600  0.00162  
 8 13714      2         -0.0267           -0.0322  0.00255    0.00319  0.00180  
 9 6440       3         -0.0615           -0.0801  0.00596    0.00617  0.00344  
10 8279       3         -0.0357           -0.0431  0.00360    0.00502  0.00208  
11 11034      2         -0.0262           -0.0323  0.00270    0.00534  0.00191  
12 2815      20         -0.140            -0.157   0.0145     0.0147   0.00324  
13 3920       2         -0.0634           -0.0732  0.00665    0.00710  0.00470  
14 6823      20         -0.132            -0.161   0.0140     0.0172   0.00314  
15 11059      4         -0.0512           -0.0534  0.00545    0.00533  0.00273  
16 1026       2         -0.00787          -0.00954 0.000853   0.00495  0.000603 
17 4854       2         -0.0175           -0.0146  0.00190    0.000189 0.00135  
18 5693      20         -0.0904           -0.0919  0.00993    0.0109   0.00222  
19 11197     20         -0.125            -0.142   0.0140     0.0169   0.00313  
20 5136      20         -0.106            -0.113   0.0125     0.0224   0.00281  
# ℹ 16 more variables: se_inject <dbl>, t_nudge <dbl>, t_inject <dbl>,
#   df <int>, p_t_nudge <dbl>, p_t_inject <dbl>, d_z_nudge <dbl>,
#   d_z_inject <dbl>, ci_low_nudge <dbl>, ci_high_nudge <dbl>,
#   ci_low_inject <dbl>, ci_high_inject <dbl>, sign_nudge <chr>,
#   sign_inject <chr>, word <chr>, valence <dbl>
macho_table_all_words_fig
Top 10 Macho (vs. Nice) Words

| cohenD_inject | words_inject | cohenD_nudge | word_nudge | rank |
|---------------|--------------|--------------|------------|------|
| 255.456 | launcher | 129.888 | outwit | 1 |
| 106.013 | striker | 82.791 | launcher | 2 |
| 50.104 | safari | 72.212 | safari | 3 |
| 40.676 | celibate | 40.461 | striker | 4 |
| 39.298 | outwit | 32.563 | info | 5 |
| 24.180 | eggshell | 30.688 | panda | 6 |
| 20.513 | midlife | 26.554 | jumper | 7 |
| 17.765 | superpower | 23.781 | superpower | 8 |
| 17.193 | payload | 22.951 | eggshell | 9 |
| 15.832 | gloat | 21.065 | tarmac | 10 |
| 15.792 | temp | 19.816 | midlife | 1 |
| 14.260 | neutron | 15.724 | chic | 2 |
| 13.823 | swank | 15.605 | crossover | 3 |
| 13.449 | anemic | 15.191 | celibate | 4 |
| 13.341 | panda | 14.990 | gloat | 5 |
| 12.716 | perm | 14.817 | neutron | 6 |
| 11.592 | tarmac | 14.147 | perm | 7 |
| 11.590 | freckle | 12.591 | anemic | 8 |
| 10.637 | commando | 12.475 | rhino | 9 |
| 10.459 | keypad | 11.662 | commando | 10 |
nice_table_all_words_fig
Top 10 Nice (vs. Macho) Words

| cohenD_inject | words_inject | cohenD_nudge | word_nudge | rank |
|---------------|--------------|--------------|------------|------|
| -77.261 | forevermore | -137.490 | unlisted | 1 |
| -56.663 | sauerkraut | -122.791 | bubbly | 2 |
| -35.783 | mozzarella | -57.336 | sauerkraut | 3 |
| -27.431 | doornail | -33.173 | mozzarella | 4 |
| -20.252 | bubbly | -13.963 | lopsided | 5 |
| -12.978 | insightful | -13.559 | seedy | 6 |
| -11.348 | unlisted | -12.153 | creamed | 7 |
| -10.708 | courteous | -10.474 | wiggle | 8 |
| -10.309 | easygoing | -10.313 | insightful | 9 |
| -10.081 | wiggle | -9.909 | nuance | 10 |
| -10.015 | shit | -9.702 | sherbet | 1 |
| -9.318 | kindness | -9.666 | courteous | 2 |
| -8.584 | nuance | -9.533 | easygoing | 3 |
| -8.427 | sincere | -9.431 | kindness | 4 |
| -8.421 | hearted | -9.396 | shit | 5 |
| -8.270 | grateful | -9.219 | beep | 6 |
| -7.894 | affection | -9.173 | forevermore | 7 |
| -7.874 | compassion | -9.104 | hearted | 8 |
| -7.777 | remembrance | -8.940 | sincere | 9 |
| -7.709 | agreeable | -8.434 | generous | 10 |

Thought. We should consider an inclusion criterion that retains only words appearing in ≥ ~5 decades, since many “Top-20” items occur in only a few decades.

table(top_20_words_nice_nudge_overall$n)

 2  3  4 20 
12  2  1  5 
table(top_20_words_macho_nudge_overall$n)

 2  3  7 
15  4  1 
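Following that thought, the inclusion criterion is a one-line filter. A sketch with a toy stand-in for words_t_summ (the real frame carries the same n column counting decade appearances):

```r
library(dplyr)

# Toy stand-in for words_t_summ (hypothetical words and decade counts):
toy_summ <- data.frame(
  word = c("launcher", "striker", "safari", "eggshell"),
  n    = c(2, 20, 5, 3)
)

# Keep only words observed in at least 5 decade models:
words_t_summ_stable <- toy_summ |>
  filter(n >= 5)
```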

For traits, there are two options:

  1. Run the statistical test on the traits_warriner data frame so that the SD is calculated from trait-level deltas; or

  2. Filter words_t_summ to only those in traits_warriner, which computes SD over the entire word list.

Option (1) seems preferable.

Note: In warriner_COSINE_DDR_omit we removed the terms used to create the DDRs, so none of those (macho_guy_words / nice_guy_words) should appear in the trait list.
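A cheap sanity check for that note is to assert that no seed term survives. The vectors below are hypothetical stand-ins; in the real pipeline you would use macho_guy_words, nice_guy_words, and the words in traits_warriner:

```r
# Hypothetical stand-ins for the seed dictionaries and the trait vocabulary:
seed_terms  <- c("macho", "tough", "nice", "gentle")
trait_words <- c("courteous", "forceful", "sincere")

overlap <- intersect(seed_terms, trait_words)
stopifnot(length(overlap) == 0)  # fails loudly if any seed term leaked in
```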

## ------  Top 10 traits------




traits_t_summ <- compare_nice_macho_words(raw_long, only_traits = TRUE)

#### ------ Macho -------
top_10_traits_macho_nudge_overall <- traits_t_summ |> 
  # Significant rows only
  filter(sign_nudge != "No") |> 
  # arranging by the standardized delta across decades
  arrange(desc(d_z_nudge)) |>  
  # fetching the top 10
  slice_head(n = 10) |> 
  mutate(rank = row_number())


top_10_traits_macho_inject_overall <- traits_t_summ |> 
  # Significant rows only
  filter(sign_inject != "No") |> 
  # arranging by the standardized delta across decades
  arrange(desc(d_z_inject)) |>  
  # fetching the top 10
  slice_head(n = 10) |> 
  mutate(rank = row_number())

### ----- Nice --------------
top_10_traits_nice_nudge_overall <- traits_t_summ |> 
  # Significant rows only
  filter(sign_nudge != "No") |> 
  # Ascending this time: delta is macho minus nice, so the most negative
  # values indicate the strongest association with Nice
  arrange(d_z_nudge) |>  
  # fetching the top 10
  slice_head(n = 10) |> 
  mutate(rank = row_number())

top_10_traits_nice_inject_overall <- traits_t_summ |> 
  # Significant rows only
  filter(sign_inject != "No") |> 
  # Ascending this time: delta is macho minus nice, so the most negative
  # values indicate the strongest association with Nice
  arrange(d_z_inject) |>  
  # fetching the top 10
  slice_head(n = 10) |> 
  mutate(rank = row_number())

Insight. The “small n” problem (traits appearing in only a few decade models) is less severe here; most traits appear more than 10 times overall across models.

### ---- Tabling-----

nice_table_all <- data.frame(
  cohenD_inject = top_10_traits_nice_inject_overall$d_z_inject,
  words_inject = top_10_traits_nice_inject_overall$word,
  cohenD_nudge = top_10_traits_nice_nudge_overall$d_z_nudge,
  word_nudge = top_10_traits_nice_nudge_overall$word,
  rank = 1:10
)
nice_all_traits_table<- kable(
  nice_table_all,
  caption = "Top 10 Nice (vs. Macho) Traits",
  digits = 3
)

macho_table_all <- data.frame(
  cohenD_inject = top_10_traits_macho_inject_overall$d_z_inject,
  words_inject = top_10_traits_macho_inject_overall$word,
  cohenD_nudge = top_10_traits_macho_nudge_overall$d_z_nudge,
  word_nudge = top_10_traits_macho_nudge_overall$word,
  rank = 1:10
)
macho_table_all_table<- kable(
  macho_table_all,
  caption = "Top 10 Macho (vs. Nice) Traits",
  digits = 3
)
macho_table_all_table
Top 10 Macho (vs. Nice) Traits

| cohenD_inject | words_inject | cohenD_nudge | word_nudge | rank |
|---------------|--------------|--------------|------------|------|
| 4.886 | conservative | 3.935 | enterprising | 1 |
| 4.358 | manipulative | 3.771 | conservative | 2 |
| 3.811 | active | 3.359 | defensive | 3 |
| 3.663 | enterprising | 3.165 | manipulative | 4 |
| 3.394 | defensive | 3.031 | indecisive | 5 |
| 3.364 | forceful | 2.787 | ruthless | 6 |
| 3.177 | brilliant | 2.774 | belligerent | 7 |
| 3.158 | weak | 2.543 | brilliant | 8 |
| 3.150 | indecisive | 2.502 | objective | 9 |
| 3.123 | energetic | 2.493 | forceful | 10 |
nice_all_traits_table
Top 10 Nice (vs. Macho) Traits

| cohenD_inject | words_inject | cohenD_nudge | word_nudge | rank |
|---------------|--------------|--------------|------------|------|
| -12.978 | insightful | -10.313 | insightful | 1 |
| -10.708 | courteous | -9.666 | courteous | 2 |
| -10.309 | easygoing | -9.533 | easygoing | 3 |
| -8.427 | sincere | -8.940 | sincere | 4 |
| -7.709 | agreeable | -8.434 | generous | 5 |
| -7.022 | friendly | -6.955 | agreeable | 6 |
| -6.485 | gracious | -6.732 | friendly | 7 |
| -5.744 | pleasant | -6.718 | gracious | 8 |
| -5.698 | compassionate | -6.696 | patient | 9 |
| -5.543 | patient | -6.659 | helpful | 10 |

2.2.3 Q3

Over time, does the valence of the words/traits most associated with one construct (vs. the other) change?

For each decade, I identify the Top-10 most related words and compute their mean valence.

To make interpretation more intuitive, I rescale valence from the original range to −4…+4, where negative values indicate negative valence and positive values indicate, well, positive valence.
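The recentering is just a constant shift. Assuming a 0–8 raw range (which is what the subtraction of 4 in the code below implies), the endpoints map to −4 and +4 with 0 as neutral:

```r
# Constant shift: on an assumed 0-8 raw scale, subtracting the midpoint 4
# recenters valence to -4..+4, so 0 marks neutral valence.
valence_raw      <- c(0, 4, 8)
valence_centered <- valence_raw - 4
```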

# --------- WITHIN DECADES------------ 

#----- Q3 ----


get_top_valence <- function(data,
                            decade,
                            is.nudge = FALSE,
                            only_traits = FALSE,
                            macho_vs = TRUE) {
  # Words or traits?
  if (only_traits) {
    data <- data |> filter(doc_id %in% (traits_warriner |>
                                          select(id) |>
                                          pull()))
  }
  # Nudge or inject
  col <- if (is.nudge) "delta_nudge" else "delta_inject"
  
  # If macho_vs is FALSE we want the words more associated with nice (vs. macho),
  # hence ascending order of delta, since delta is macho MINUS nice
  dir <- if (macho_vs) -1 else 1
  data <- data[order(dir * data[[col]]), ]
  data |>
    slice_head(n = 10) |>
    summarise(
      mean_valence = mean(valence, na.rm = TRUE),
      decade = as.numeric(decade)
    )
}




# the list of 20 df with deltas: warriner_COSINE_DDR_Diff

### ------- Macho ------
#  Words Inject

valence_words_macho_decades_inject <- map2_dfr(
  warriner_COSINE_DDR_Diff, decades,
  ~ get_top_valence(.x,
                    decade = .y)
) |>
  mutate(mean_valence = mean_valence - 4)

# Words Nudge
valence_words_macho_decades_nudge <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    is.nudge = TRUE)
) |>
  mutate(mean_valence = mean_valence - 4)

# Traits Inject
valence_traits_macho_decades_inject <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    only_traits = TRUE)
) |>
  mutate(mean_valence = mean_valence - 4)

# Traits Nudge
valence_traits_macho_decades_nudge <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    only_traits = TRUE,
                    is.nudge = TRUE)
) |>
  mutate(mean_valence = mean_valence - 4)

## -------- Nice --------
# Words Inject
valence_words_nice_decades_inject <- map2_dfr(
  warriner_COSINE_DDR_Diff, decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    macho_vs = FALSE)
) |>
  mutate(mean_valence = mean_valence - 4)

# Words Nudge
valence_words_nice_decades_nudge <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    is.nudge = TRUE,
                    macho_vs = FALSE)
) |>
  mutate(mean_valence = mean_valence - 4)

# Traits Inject
valence_traits_nice_decades_inject <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    only_traits = TRUE,
                    macho_vs = FALSE)
) |>
  mutate(mean_valence = mean_valence - 4)

# Traits Nudge
valence_traits_nice_decades_nudge <- map2_dfr(
  warriner_COSINE_DDR_Diff,
  decades,
  ~ get_top_valence(.x,
                    decade = .y,
                    only_traits = TRUE,
                    is.nudge = TRUE,
                    macho_vs = FALSE)
) |>
  mutate(mean_valence = mean_valence - 4)










# ----- Modeling Macho ----

# Word inject - Macho 
lm_macho_words_inject <- lm(mean_valence ~ decade, data = valence_words_macho_decades_inject)

# Word nudge - Macho 
lm_macho_words_nudge <- lm(mean_valence ~ decade, data = valence_words_macho_decades_nudge)

# Trait inject - Macho 
lm_macho_traits_inject <- lm(mean_valence ~ decade, data = valence_traits_macho_decades_inject)

# Trait nudge - Macho 
lm_macho_traits_nudge <- lm(mean_valence ~ decade, data = valence_traits_macho_decades_nudge)



# --------- Modeling Nice -------- 

# Word inject - Nice 
lm_nice_words_inject <- lm(mean_valence ~ decade, data = valence_words_nice_decades_inject)

# Word nudge - Nice
lm_nice_words_nudge <- lm(mean_valence ~ decade, data = valence_words_nice_decades_nudge)

# Trait inject - Nice 
lm_nice_traits_inject <- lm(mean_valence ~ decade, data = valence_traits_nice_decades_inject)

# Trait nudge -Nice 
lm_nice_traits_nudge <- lm(mean_valence ~ decade, data = valence_traits_nice_decades_nudge)
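To read a trend off any of these fits, the decade coefficient is the change in mean valence per year. A self-contained toy example (synthetic data with a known slope, not the project's results):

```r
# Synthetic decade series with a built-in slope of 0.01 valence units per year:
toy <- data.frame(
  decade       = seq(1800, 1990, by = 10),
  mean_valence = seq(-1, 0.9, length.out = 20)
)

fit <- lm(mean_valence ~ decade, data = toy)
coef(fit)[["decade"]]        # slope: change in mean valence per year
summary(fit)$coefficients    # t statistic and p-value for the trend
```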

Voila!!

Attractiveness

In this section, I construct decade-specific Attractiveness DDRs and then examine how semantically close the macho and nice representations are to attractiveness across time.

The goal is to quantify how the semantic association between macho and gentle (nice) masculinity and attractiveness has shifted across historical decades.

2.3 Word list

We begin by specifying a set of core adjectives that directly express attractiveness (e.g., attractive, beautiful, handsome, sexy, etc.). Two versions are created:

  • Nudge version: uses only the attractiveness descriptors.

  • Inject version: includes the attractiveness descriptors plus the male anchor words, ensuring a stronger masculine reference frame in the embedding space.

This parallels the earlier (Macho/Nice) pipeline for consistency.

attractiveness_words <- c(
  "attractive",
  "beautiful",
  "handsome",
  "gorgeous",
  "pretty",
  "sexy",
  "cute",
  "alluring",
  "charming",
  "appealing"
)

# injected version: attractiveness descriptors plus the male anchors
attractiveness_words_injected <- c(attractiveness_words, male_anchors)

For each historical decade represented in the ENGall embedding model, we compute an Attractiveness DDR using the same procedure applied previously to Macho and Nice:

Inject condition

  1. Retrieve the embeddings of all attractiveness words + male anchors.
  2. Compute their average embedding, yielding the Attractiveness Injected DDR for that decade.

Nudge condition

  1. Retrieve the embeddings of the attractiveness adjectives.
  2. Average them to obtain the raw Attractiveness Nudge DDR.
  3. Apply the masculinity_nudge() transformation to align the DDR with the nudge paradigm used for the Macho/Nice DDRs.

Each DDR represents a single semantic vector summarizing the meaning of attractiveness in that decade.

# ---- setting the ENGall model to fit 'embedplyr' workflow
## ----------- Attractiveness DDR (with injection) ----------
Attractiveness_DDR_injected_by_decade <- list()
for (i in seq_along(ENGall_model)) {
  
  Attractiveness_DDR_injected_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), attractiveness_words_injected)|> 
    average_embedding()
}
names(Attractiveness_DDR_injected_by_decade) <- decades
## ----------- Attractiveness DDR Nudge  ----------
Attractiveness_DDR_nudge_by_decade <- list()
for (i in seq_along(ENGall_model)) {
  Attractiveness_DDR_nudge_by_decade[[i]] <- predict(to_embeddings(ENGall_model[[i]]), attractiveness_words)|>
  average_embedding() |>
    masculinity_nudge(E = ENGall_model[[i]])
}
names(Attractiveness_DDR_nudge_by_decade) <- decades

2.4 Cosine Similarity from Attractiveness DDRs

# Helper function to compute cosine similarity from Attractiveness DDRs

# -------------------------------
# Helper: pure cosine similarity
# -------------------------------
cosine_sim <- function(v1, v2) {
  sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2)))
}
  
# List to store results
cosine_from_attractiveness_nudge <- list()
cosine_from_attractiveness_inject <- list()
for (i in seq_along(ENGall_model)) {
  
  # computing the cosine similarity with the cosine_sim(v1, v2) helper
  # Nudge:
  # Nudge:
  cosine_from_attractiveness_nudge[[i]] <- data.frame(
    sims_macho_attract = cosine_sim(Macho_DDR_nudge_by_decade[[i]],
                                    Attractiveness_DDR_nudge_by_decade[[i]]),
    sims_nice_attract = cosine_sim(Nice_DDR_nudge_by_decade[[i]],
                                   Attractiveness_DDR_nudge_by_decade[[i]]),
    i_decade = i
  )

  # Inject:
  cosine_from_attractiveness_inject[[i]] <- data.frame(
    sims_macho_attract = cosine_sim(Macho_DDR_injected_by_decade[[i]],
                                    Attractiveness_DDR_injected_by_decade[[i]]),
    sims_nice_attract = cosine_sim(Nice_DDR_injected_by_decade[[i]],
                                   Attractiveness_DDR_injected_by_decade[[i]]),
    i_decade = i
  )
}
names(cosine_from_attractiveness_nudge) <- decades
names(cosine_from_attractiveness_inject) <- decades
# Binding the results into a data frame for plotting
cosine_attract_nudge_df <- bind_rows(cosine_from_attractiveness_nudge) |>
  mutate(
    decade = decades[i_decade]
  ) |>
  select(-i_decade)
cosine_attract_inject_df <- bind_rows(cosine_from_attractiveness_inject) |>
  mutate(
    decade = decades[i_decade]
  ) |>
  select(-i_decade)
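Before plotting, a quick sanity check of the cosine_sim() helper on toy vectors (the definition is repeated so the snippet runs standalone): identical directions give 1, orthogonal vectors give 0, and scaling does not matter.

```r
cosine_sim <- function(v1, v2) {
  sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2)))
}

cosine_sim(c(1, 0), c(1, 0))  # same direction   -> 1
cosine_sim(c(1, 0), c(0, 1))  # orthogonal       -> 0
cosine_sim(c(1, 2), c(2, 4))  # parallel, scaled -> 1
```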

# Plot the results
# Nudge plot
nudge_attr_plot <- ggplot(cosine_attract_nudge_df, aes(x = decade)) +
  geom_line(aes(y = sims_macho_attract, color = "Macho")) +
  geom_line(aes(y = sims_nice_attract, color = "Nice")) +
  labs(
    title = "Cosine Similarity from Attractiveness DDR (Nudge)",
    x = "Decade",
    y = "Cosine Similarity",
    color = "Construct"
  ) +
  theme_minimal()
# Inject plot
inject_attr_plot <- ggplot(cosine_attract_inject_df, aes(x = decade)) +
  geom_line(aes(y = sims_macho_attract, color = "Macho")) +
  geom_line(aes(y = sims_nice_attract, color = "Nice")) +
  labs(
    title = "Cosine Similarity from Attractiveness DDR (Inject)",
    x = "Decade",
    y = "Cosine Similarity",
    color = "Construct"
  ) +
  theme_minimal()

# Nudge t-test  
tt_nudge <- t.test(cosine_attract_nudge_df$sims_macho_attract,
                    cosine_attract_nudge_df$sims_nice_attract,
                    alternative = "two.sided")
if (tt_nudge$p.value < 0.05) {
  print("Attractiveness Nudge - supports H1")
} else {
  print("Attractiveness Nudge - supports H0")
}
[1] "Attractiveness Nudge - supports H1"
# Inject t-test
tt_inject <- t.test(cosine_attract_inject_df$sims_macho_attract,
                  cosine_attract_inject_df$sims_nice_attract,
                  alternative = "two.sided")
if (tt_inject$p.value < 0.05) {
  print("Attractiveness Inject - supports H1")
} else {
  print("Attractiveness Inject - supports H0")
}
[1] "Attractiveness Inject - supports H1"
# Show the results in an APA-style table
attractiveness_cosim_table <- data.frame(
  DDR_Type = c("Nudge", "Inject"),
  t_value = c(tt_nudge$statistic, tt_inject$statistic),
  df = c(tt_nudge$parameter, tt_inject$parameter),
  p_value = c(tt_nudge$p.value, tt_inject$p.value),
  mean_macho_cosim = c(mean(cosine_attract_nudge_df$sims_macho_attract),
                       mean(cosine_attract_inject_df$sims_macho_attract)),
  mean_nice_cosim = c(mean(cosine_attract_nudge_df$sims_nice_attract),
                      mean(cosine_attract_inject_df$sims_nice_attract))
)
kable(
  attractiveness_cosim_table,
  caption = "T-test results for Cosine Similarity from Attractiveness DDR",
  digits = 3
)
T-test results for Cosine Similarity from Attractiveness DDR

| DDR_Type | t_value | df | p_value | mean_macho_cosim | mean_nice_cosim |
|----------|---------|----|---------|------------------|-----------------|
| Nudge  | -2.099 | 34.721 | 0.043 | 0.606 | 0.647 |
| Inject | -5.103 | 35.973 | 0.000 | 0.482 | 0.581 |

Across decades, the Nice DDR was consistently more semantically associated with attractiveness than the Macho DDR. In the nudge condition, Nice DDRs showed significantly higher cosine similarity to attractiveness than Macho DDRs, t(34.72) = −2.10, p = .043 (M_nice = .647, M_macho = .606). This pattern was even stronger in the inject condition, where Nice DDRs again exhibited greater proximity to attractiveness, t(35.97) = −5.10, p < .001 (M_nice = .581, M_macho = .482).

Together, these results indicate that across historical decades, the semantic space positions “nice” traits closer to attractiveness than “macho” traits, under both nudge and inject paradigms.